Abstract

In this paper, the proximal Gauss-Newton method for solving penalized nonlinear least 
squares problems is studied. A local convergence analysis is obtained under the assumption 
that the derivative of the function associated with the penalized least squares problem satisfies
a majorant condition. Our analysis provides a clear relationship between the majorant function 
and the function associated with the penalized least squares problem. The convergence for two 
important special cases is also derived. 
1 Introduction

We consider the penalized nonlinear least squares problem

$$\min_{x \in \Omega} \; \frac{1}{2}\|F(x)\|^2 + J(x), \tag{1}$$

where $X$ and $Y$ are real or complex Hilbert spaces, $\Omega \subseteq X$ is an open set, $F : \Omega \to Y$ is a
continuously differentiable nonlinear function and $J : \Omega \to \mathbb{R} \cup \{+\infty\}$ is a proper, convex and lower
semicontinuous functional. A wide variety of applications can be found in mathematical program- 
ming literature, see for example [U [T5l I16j . In particular, if J{x) = 0, for all x € Q, the problem 
([1]) becomes the classical nonlinear least squares problem studied in [5l [6l [HI [9] . In this case, a 
generalization of the Newton method, called the Gauss-Newton method, can be used. This iterative
algorithm computes the sequence

$$x_{k+1} = x_k - F'(x_k)^{\dagger} F(x_k), \qquad k = 0, 1, \ldots,$$

where $F'(x_k)^{\dagger}$ denotes the Moore-Penrose inverse of the linear operator $F'(x_k)$.
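The iteration above can be sketched in finite dimensions as follows. This is only an illustration: the test function `F`, its Jacobian `jac` and the starting point are hypothetical choices (a small zero-residual problem with solution (1, 2)), not taken from the paper, and `numpy.linalg.pinv` plays the role of the Moore-Penrose inverse.

```python
import numpy as np

def gauss_newton(F, jac, x0, iters=20):
    """Classical Gauss-Newton: x_{k+1} = x_k - pinv(F'(x_k)) F(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        # np.linalg.pinv computes the Moore-Penrose inverse of the Jacobian.
        x = x - np.linalg.pinv(jac(x)) @ F(x)
    return x

# Hypothetical zero-residual test problem (an assumption for illustration):
# F(x) = (x0^2 + x1 - 3, x0 + x1^2 - 5, x0 - 1), so that F(1, 2) = 0.
def F(x):
    return np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0, x[0] - 1.0])

def jac(x):
    return np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]], [1.0, 0.0]])

x_limit = gauss_newton(F, jac, x0=[1.2, 1.8])
```

Starting close to the solution, the iterates approach (1, 2), in line with the local nature of the convergence results discussed in this paper.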

In this paper, we consider the proximal Gauss-Newton method, introduced in [15], for solving (1).
This method extends the classical Gauss-Newton approach. It is defined as

$$x_{k+1} = \operatorname{prox}_J^{H(x_k)}\bigl(x_k - F'(x_k)^{\dagger}F(x_k)\bigr), \qquad k = 0, 1, \ldots,$$

where $\operatorname{prox}_J^{H(x_k)}$ is the proximity operator associated to $J$ (see [12, 13, 14, 15]) with respect to the
metric defined by the operator $H(x_k) := F'(x_k)^*F'(x_k)$. It should be mentioned that computing
the proximity operator is in general not straightforward and may itself require an iterative algorithm,
since a closed form is generally not available.

The aim of this paper is to present a new local convergence analysis of proximal Gauss- 
Newton method under a majorant condition. This majorant formulation follows the ideas used 
in [3 [SI [71 [H [9] . This analysis provides a clear relationship between the majorant function, which 
relaxes the Lipschitz continuity of $F'$, and the nonlinear operator $F$ associated with the penalized
nonlinear least squares problem. Two majorant functions are also considered. In the first
case, which corresponds to functions with Lipschitz derivative, the classical convergence results are
recovered. In the second, the convergence analysis for analytic operators is discussed for the first time.

The convergence of the sequence generated by the proximal Gauss-Newton method was also
studied in [15]. There, instead of a majorant function, Wang's condition, introduced in [19, 20],
is used for the analysis. In fact, it can be shown that these conditions are equivalent. However, the
formulation as a majorant condition is preferable because it provides a clear relationship between the
majorant function and the nonlinear function $F$ under consideration. Furthermore, the majorant
condition simplifies the proofs of the results obtained.

The organization of the paper is as follows. Next, we list some notations and a basic result used
in our presentation. In Section 2, some results on the Moore-Penrose inverse, proximity operators and
the proximal Gauss-Newton algorithm are discussed. In Section 3, we state the main result and,
for a better organization, the analysis is divided into three parts. First, some properties of the
majorant function are established. Then, in Subsection 3.2, we present the relationships between
the majorant function and the nonlinear function $F$. Finally, in the last part our main result is
proven. Section 4 is devoted to showing the consequences of this result in particular cases.

1.1 Notation and auxiliary results 

The following notations and results are used throughout our presentation. Let $X$ and $Y$ be Hilbert
spaces. The open and closed balls in $X$ with center $a$ and radius $r$ are denoted, respectively, by
$B(a,r)$ and $B[a,r]$. For simplicity, given $x \in X$, we use the short notation

$$\sigma(x) := \|x - x_*\|.$$

From now on, $\Omega \subseteq X$ is an open set, $J : \Omega \to \mathbb{R} \cup \{+\infty\}$ is a proper, convex and lower semicontinuous
functional and $F : \Omega \to Y$ is a continuously differentiable function such that $F'$ has a closed
image in $\Omega$. We use $\mathcal{L}(X,Y)$ to denote the space of bounded linear operators from $X$ to $Y$, and $I_X$
denotes the identity operator on $X$. Finally, if $A \in \mathcal{L}(X,Y)$, then $\operatorname{Ker}(A)$ and $\operatorname{im}(A)$ are
the kernel and image of $A$, respectively, and $A^*$ is its adjoint operator.

The following auxiliary results of elementary convex analysis will be needed: 

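Proposition 1. Let $\epsilon > 0$ and $\tau \in [0,1]$. If $\varphi : [0,\epsilon) \to \mathbb{R}$ is convex, then

• The function $l : (0,\epsilon) \to \mathbb{R}$ defined by

$$l(t) = \frac{\varphi(t) - \varphi(\tau t)}{t}$$

is monotone increasing.

• $D^+\varphi(0) = \lim_{u \to 0^+} \dfrac{\varphi(u) - \varphi(0)}{u} = \inf_{0 < u} \dfrac{\varphi(u) - \varphi(0)}{u}$.

Proof. See Theorem 4.1.1 and Remark 4.1.2, p. 21 of [11]. □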

2 Preliminary 

In this section some results on Moore-Penrose inverse and proximity operators will be presented. 
Then, the algorithm to solve problem (1) and some properties related to it will be introduced.

2.1 Generalized inverses 

In this section some results on Moore-Penrose inverse, will be presented. More details can be found 
in [IHIEI]. 

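Let $A \in \mathcal{L}(X,Y)$ with a closed image. The Moore-Penrose inverse of $A$ is the linear operator
$A^{\dagger} \in \mathcal{L}(Y,X)$ which satisfies:

$$A A^{\dagger} A = A, \quad A^{\dagger} A A^{\dagger} = A^{\dagger}, \quad (A A^{\dagger})^* = A A^{\dagger}, \quad (A^{\dagger} A)^* = A^{\dagger} A.$$

From the definition of the Moore-Penrose inverse, it is easy to see that

$$A^{\dagger}A = I_X - \Pi_{\operatorname{Ker}(A)}, \qquad A A^{\dagger} = \Pi_{\operatorname{im}(A)}, \tag{2}$$

where $\Pi_E$ denotes the projection of $X$ onto the subspace $E$.
If $A$ is injective, then

$$A^{\dagger} = (A^*A)^{-1}A^*, \qquad A^{\dagger}A = I_X, \qquad \|A^{\dagger}\|^2 = \|(A^*A)^{-1}\|. \tag{3}$$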
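The defining relations and the identities in (3) can be checked numerically in finite dimensions. The sketch below uses a random real 5-by-3 matrix as a stand-in for an injective operator with closed image; the matrix, the seed and the variable names are illustrative assumptions, and in the real case the adjoint is simply the transpose.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # tall matrix; full column rank with probability one
A_pinv = np.linalg.pinv(A)        # Moore-Penrose inverse

# The four defining relations of the Moore-Penrose inverse.
penrose_ok = (
    np.allclose(A @ A_pinv @ A, A)
    and np.allclose(A_pinv @ A @ A_pinv, A_pinv)
    and np.allclose((A @ A_pinv).T, A @ A_pinv)
    and np.allclose((A_pinv @ A).T, A_pinv @ A)
)

# The identities in (3), valid because A is injective.
formula_ok = np.allclose(A_pinv, np.linalg.solve(A.T @ A, A.T))
left_inverse_ok = np.allclose(A_pinv @ A, np.eye(3))
norm_ok = np.isclose(np.linalg.norm(A_pinv, 2) ** 2,
                     np.linalg.norm(np.linalg.inv(A.T @ A), 2))
```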

We end this part with a result concerning the variation of the pseudo-inverse, see [15, 18, 21].

Lemma 2. Let $A, B \in \mathcal{L}(X,Y)$ with closed images. If $A$ is injective and $\|A^{\dagger}\|\|A - B\| < 1$, then
$B$ is injective and

$$\|B^{\dagger}\| \le \frac{\|A^{\dagger}\|}{1 - \|A^{\dagger}\|\|A - B\|}, \qquad \|B^{\dagger} - A^{\dagger}\| \le \frac{\sqrt{2}\,\|A^{\dagger}\|^2\|A - B\|}{1 - \|A^{\dagger}\|\|A - B\|}.$$
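As a sanity check, the two bounds of Lemma 2 can be evaluated on a random finite-dimensional example. The matrices, the seed and the scaling factor 0.5 below are illustrative assumptions; the spectral norm is used throughout.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))   # injective with probability one
E = rng.standard_normal((6, 3))
A_pinv = np.linalg.pinv(A)
norm_A_pinv = np.linalg.norm(A_pinv, 2)

# Scale the perturbation so that ||A^+|| * ||A - B|| = 0.5 < 1.
E *= 0.5 / (norm_A_pinv * np.linalg.norm(E, 2))
B = A + E
B_pinv = np.linalg.pinv(B)

theta = norm_A_pinv * np.linalg.norm(A - B, 2)
bound_norm = norm_A_pinv / (1.0 - theta)
bound_diff = np.sqrt(2.0) * norm_A_pinv**2 * np.linalg.norm(A - B, 2) / (1.0 - theta)

lhs_norm = np.linalg.norm(B_pinv, 2)           # ||B^+||
lhs_diff = np.linalg.norm(B_pinv - A_pinv, 2)  # ||B^+ - A^+||
```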

2.2 Proximity operators 

Proximity operators were introduced by Moreau and their use in signal theory goes back to 
We briefly recall some essential facts below and refer the reader to [12, 13, 15] for more details.

Let $H : X \to X$ be a continuous, positive and selfadjoint operator, bounded from below and, therefore,
invertible. Then we have a new scalar product on $X$ by setting $\langle x, z\rangle_H = \langle x, Hz\rangle$. Hence,
the corresponding induced norm $\|\cdot\|_H$ is equivalent to the given norm on $X$, since the following
inequalities hold:

$$\frac{1}{\|H^{-1}\|}\,\|x\|^2 \le \|x\|_H^2 \le \|H\|\,\|x\|^2.$$

The Moreau-Yosida approximation of $J$ with respect to the scalar product induced by $H$ is the
functional $M_J : X \to \mathbb{R}$ defined by setting

$$M_J(z) = \inf_{x \in X}\Bigl\{ J(x) + \frac{1}{2}\|x - z\|_H^2 \Bigr\}. \tag{4}$$

Recalling that $J : X \to \mathbb{R} \cup \{+\infty\}$ is a convex, lower semicontinuous and proper function, it is
easy to prove that the infimum in (4) is attained at a unique point. Therefore, let us
call $\operatorname{prox}_J^H(z)$ the proximity operator associated to $J$ and $H$:

$$\operatorname{prox}_J^H : X \to X, \qquad z \mapsto \operatorname*{argmin}_{x \in X}\Bigl\{ J(x) + \frac{1}{2}\|x - z\|_H^2 \Bigr\}. \tag{5}$$

Writing the first order optimality conditions for (5), we obtain that

$$p = \operatorname{prox}_J^H(z) \iff 0 \in \partial J(p) + H(p - z) \iff Hz \in (\partial J + H)(p),$$

which, using that the minimum in (5) is attained at a unique point, leads to

$$\operatorname{prox}_J^H(z) = (\partial J + H)^{-1}(Hz).$$
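For a concrete instance of the resolvent formula, take the smooth penalty $J(x) = (\mu/2)\|x\|^2$ on $\mathbb{R}^n$, for which $\partial J = \mu I$ and hence $\operatorname{prox}_J^H(z) = (\mu I + H)^{-1}Hz$. The following sketch checks this against the first order condition; the penalty, the matrix `Jac` used to build `H`, and all numerical values are assumptions chosen for illustration only.

```python
import numpy as np

def prox_quadratic(z, H, mu):
    """prox_J^H(z) for J(x) = (mu/2)||x||^2, i.e. (mu*I + H)^{-1} H z."""
    n = H.shape[0]
    return np.linalg.solve(mu * np.eye(n) + H, H @ z)

rng = np.random.default_rng(2)
Jac = rng.standard_normal((4, 2))
H = Jac.T @ Jac                    # positive definite when Jac is injective
z = np.array([1.0, -2.0])
mu = 0.5
p = prox_quadratic(z, H, mu)

def objective(x):
    """J(x) + 0.5 * ||x - z||_H^2 for the quadratic penalty above."""
    return 0.5 * mu * x @ x + 0.5 * (x - z) @ H @ (x - z)

# First order condition: 0 = grad J(p) + H (p - z).
residual = mu * p + H @ (p - z)
```

For nonsmooth penalties no such closed form is available in general, which is why the proximity operator may itself require an inner iterative method.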

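This part ends with an important property of proximity operators.

Lemma 3. Let $H_1$ and $H_2$ be two continuous positive selfadjoint operators on $X$, both bounded
from below. Then,

$$\|\operatorname{prox}_J^{H_1}(z_1) - \operatorname{prox}_J^{H_2}(z_2)\| \le \bigl(\|H_1\|\|H_1^{-1}\|\bigr)^{1/2}\|z_1 - z_2\| + \|H_1^{-1}\|\,\bigl\|(H_1 - H_2)\bigl(z_2 - \operatorname{prox}_J^{H_2}(z_2)\bigr)\bigr\|,$$

for every $z_1, z_2 \in X$.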
Proof. See Remark 4 in [K]. D 



2.3 The proximal Gauss-Newton method 

In this section we present the algorithm to solve (1) as well as some related properties.

The goal of this method, introduced in [15], is to find stationary points of problem (1) as follows:

$$x_{k+1} = \operatorname{prox}_J^{H(x_k)}\bigl(x_k - F'(x_k)^{\dagger}F(x_k)\bigr), \qquad k = 0, 1, \ldots, \tag{6}$$

where $H(x_k) = F'(x_k)^*F'(x_k)$ and $\operatorname{prox}_J^{H(x_k)}$ is the proximity operator associated to $J$ and $H(x_k)$
as defined in (5).

Remark 1. As proved in Propositon 6 of \W^ . given Xk G X, if F'{xk) is injective with closed 
image, then $x_{k+1}$ satisfies

$$x_{k+1} = \operatorname*{argmin}_{x \in X} \; \frac{1}{2}\|F(x_k) + F'(x_k)(x - x_k)\|^2 + J(x).$$

This problem can be solved using first order methods for the minimization of nonsmooth convex 
functions, such as bundle methods or forward-backward methods (see fJl IJOj/]. We will use the 
proximal point formulation because the theoretical results of this area will be very useful for the 
proof of the convergence of the method. 
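To make iteration (6) concrete, consider a scalar unknown with the penalty $J(x) = \lambda|x|$. In one dimension $H(x) = F'(x)^2$ is a positive number, and the proximity operator in the metric induced by it reduces to soft-thresholding with threshold $\lambda/H(x)$. The sketch below is a toy example under these assumptions; the function $F(x) = x^2 - 1$, the value of `lam` and the starting point are hypothetical and not taken from the paper.

```python
import math

def soft_threshold(z, t):
    """Minimizer of t*|x| + 0.5*(x - z)^2 over real x."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def proximal_gauss_newton_1d(F, dF, lam, x0, iters=50):
    """Iteration (6) for scalar x and J(x) = lam*|x| (an illustrative choice)."""
    x = x0
    for _ in range(iters):
        h = dF(x) ** 2                  # H(x) = F'(x)^* F'(x), a positive scalar here
        z = x - F(x) / dF(x)            # Gauss-Newton step; the pseudo-inverse is 1/F'(x) in 1-D
        x = soft_threshold(z, lam / h)  # prox of J in the metric induced by h
    return x

F = lambda x: x * x - 1.0
dF = lambda x: 2.0 * x
lam = 0.1
x_limit = proximal_gauss_newton_1d(F, dF, lam, x0=1.0)

# Stationarity of 0.5*F(x)^2 + lam*|x| at a point x > 0: F'(x) F(x) + lam = 0.
stationarity = dF(x_limit) * F(x_limit) + lam
```

The limit is a nonzero-residual stationary point, so the observed convergence is linear rather than quadratic, which matches the linear term appearing in the error estimates of Section 3.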

In the following, we establish the connection between the stationary points of the function defined
in (1) and the fixed points of the proximal point operator.

Proposition 4. Let $x_* \in \Omega$ be such that $-F'(x_*)^*F(x_*) \in \partial J(x_*)$, i.e., $x_*$ satisfies the first order
conditions for local minimizers of (1). Assume that $F'(x_*)$ is injective and $\operatorname{im}(F'(x_*))$ is closed.
Then $x_*$ satisfies the fixed point equation

$$x_* = \operatorname{prox}_J^{H(x_*)}\bigl(x_* - F'(x_*)^{\dagger}F(x_*)\bigr),$$

where $H(x_*) = F'(x_*)^*F'(x_*)$.

Proof. The proof follows the same ideas as the proof of Proposition 5 in [15]. □

3 Local analysis for the Gauss-Newton method 

Our goal is to state and prove a local theorem for the proximal Gauss-Newton method defined in
(6). First, we show some results regarding the scalar majorant function, which relaxes the Lipschitz
condition on $F'$. Then, we establish the main relationships between the majorant function and
the nonlinear function $F$. Finally, we obtain that the proximal Gauss-Newton method is well defined and
converges. The statement of the theorem is as follows:



Theorem 5. Let $\Omega \subseteq X$ be an open set, $J : \Omega \to \mathbb{R} \cup \{+\infty\}$ a proper, convex and lower
semicontinuous functional and $F : \Omega \to Y$ a continuously differentiable function such that $F'$ has a closed
image in $\Omega$. Let $x_* \in \Omega$, $R > 0$ and

$$c := \|F(x_*)\|, \quad \beta := \|F'(x_*)^{\dagger}\|, \quad \kappa := \beta\|F'(x_*)\|, \quad \delta := \sup\{t \in [0,R) : B(x_*,t) \subset \Omega\}.$$

Suppose that $-F'(x_*)^*F(x_*) \in \partial J(x_*)$, $F'(x_*)$ is injective and there exists a continuously
differentiable function $f : [0,R) \to \mathbb{R}$ such that

$$\beta\,\|F'(x) - F'(x_* + \tau(x - x_*))\| \le f'(\sigma(x)) - f'(\tau\sigma(x)), \tag{7}$$

where $x \in B(x_*,\delta)$, $\tau \in [0,1]$ and $\sigma(x) = \|x - x_*\|$, and

h1) $f(0) = 0$ and $f'(0) = -1$;

h2) $f'$ is convex and strictly increasing;

h3) $[(1+\sqrt{2})\kappa + 1]\,c\beta\,D^+f'(0) < 1$.

Let the positive constants $\nu$, $\rho$ and $r$ be given by $\nu := \sup\{t \in [0,R) : f'(t) < 0\}$,

$$\rho := \sup\Bigl\{ t \in (0,\nu) : \frac{[f'(t)+1+\kappa]\,[t f'(t) - f(t) + c\beta(1+\sqrt{2})(f'(t)+1)] + c\beta[f'(t)+1]}{t[f'(t)]^2} < 1 \Bigr\},$$

$$r := \min\{\rho, \delta\}.$$

Given $H(x_k) = F'(x_k)^*F'(x_k)$, define $\operatorname{prox}_J^{H(x_k)}$ as the proximity operator associated to $J$ and
$H(x_k)$, see (5). Then, the proximal Gauss-Newton method for solving (1), with starting point
$x_0 \in B(x_*,r)\setminus\{x_*\}$,

$$x_{k+1} = \operatorname{prox}_J^{H(x_k)}\bigl(x_k - F'(x_k)^{\dagger}F(x_k)\bigr), \qquad k = 0, 1, \ldots, \tag{8}$$

is well defined, the generated sequence $\{x_k\}$ is contained in $B(x_*,r)$, converges to $x_*$ and

$$\begin{aligned}
\|x_{k+1} - x_*\| \le{}& \frac{[f'(\sigma(x_0)) + 1 + \kappa]\,[f'(\sigma(x_0))\sigma(x_0) - f(\sigma(x_0))]}{[\sigma(x_0) f'(\sigma(x_0))]^2}\,\|x_k - x_*\|^2 \\
&+ \frac{(1+\sqrt{2})\beta c\,[f'(\sigma(x_0)) + 1]^2}{[\sigma(x_0) f'(\sigma(x_0))]^2}\,\|x_k - x_*\|^2
 + \frac{c\beta[(1+\sqrt{2})\kappa + 1]\,[f'(\sigma(x_0)) + 1]}{\sigma(x_0)[f'(\sigma(x_0))]^2}\,\|x_k - x_*\|,
\end{aligned} \tag{9}$$

for all $k = 0, 1, \ldots$.

Remark 2. If J = 0, the proximal Gauss-Newton method becomes the classical Gauss-Newton 
method. However, with respect to the radius of the convergence ball this result does not correspond 
with the classical approach, see Theorem 7 of fd^. The reason is that the upper bound given in 
Lemma 3 is not affected if $J = 0$.




Remark 3. If the inequality in (7) holds only for $\tau = 0$, an analogous theorem is true. In fact, if
the definition of $\rho$ is replaced by

$$\rho := \sup\Bigl\{ t \in (0,\nu) : \frac{[f'(t)+1+\kappa]\,[t f'(t) + f(t) + 2t + c\beta(1+\sqrt{2})(f'(t)+1)] + c\beta[f'(t)+1]}{t[f'(t)]^2} < 1 \Bigr\},$$

then the well-definedness of the proximity operator, the inclusion of the computed sequence in $B(x_*,r)$ and
its convergence are guaranteed. In particular:

$$\begin{aligned}
\|x_{k+1} - x_*\| \le{}& \frac{[f'(\sigma(x_0)) + 1 + \kappa]\,[f'(\sigma(x_0))\sigma(x_0) + f(\sigma(x_0)) + 2\sigma(x_0)]}{[\sigma(x_0) f'(\sigma(x_0))]^2}\,\|x_k - x_*\|^2 \\
&+ \frac{(1+\sqrt{2})\beta c\,[f'(\sigma(x_0)) + 1]^2}{[\sigma(x_0) f'(\sigma(x_0))]^2}\,\|x_k - x_*\|^2
 + \frac{c\beta[(1+\sqrt{2})\kappa + 1]\,[f'(\sigma(x_0)) + 1]}{\sigma(x_0)[f'(\sigma(x_0))]^2}\,\|x_k - x_*\|,
\end{aligned}$$

for all $k = 0, 1, \ldots$.

As before, $H(x_k) = F'(x_k)^*F'(x_k)$ and $\operatorname{prox}_J^{H(x_k)}$ is the proximity operator defined in (5). For
zero-residual problems, i.e., $c = 0$, Theorem 5 becomes:

Corollary 6. Let $\Omega \subseteq X$ be an open set, $J : \Omega \to \mathbb{R} \cup \{+\infty\}$ a proper, convex and lower
semicontinuous functional and $F : \Omega \to Y$ a continuously differentiable function such that $F'$ has a closed
image in $\Omega$. Let $x_* \in \Omega$, $R > 0$ and

$$\beta := \|F'(x_*)^{\dagger}\|, \quad \kappa := \beta\|F'(x_*)\|, \quad \delta := \sup\{t \in [0,R) : B(x_*,t) \subset \Omega\}.$$

Suppose that $F(x_*) = 0$, $0 \in \partial J(x_*)$, $F'(x_*)$ is injective and there exists a continuously differentiable
function $f : [0,R) \to \mathbb{R}$ such that

$$\beta\,\|F'(x) - F'(x_* + \tau(x - x_*))\| \le f'(\sigma(x)) - f'(\tau\sigma(x)),$$

where $x \in B(x_*,\delta)$, $\tau \in [0,1]$ and $\sigma(x) = \|x - x_*\|$, and

h1) $f(0) = 0$ and $f'(0) = -1$;

h2) $f'$ is convex and strictly increasing.

Let the positive constants $\nu$, $\rho$ and $r$ be given by $\nu := \sup\{t \in [0,R) : f'(t) < 0\}$,

$$\rho := \sup\Bigl\{ t \in (0,\nu) : \frac{[f'(t)+1+\kappa]\,[t f'(t) - f(t)]}{t[f'(t)]^2} < 1 \Bigr\}, \qquad r := \min\{\rho, \delta\}.$$

Then, the proximal Gauss-Newton method for solving (1), with starting point $x_0 \in B(x_*,r)\setminus\{x_*\}$,

$$x_{k+1} = \operatorname{prox}_J^{H(x_k)}\bigl(x_k - F'(x_k)^{\dagger}F(x_k)\bigr), \qquad k = 0, 1, \ldots,$$

is well defined, the generated sequence $\{x_k\}$ is contained in $B(x_*,r)$, converges to $x_*$ and

$$\|x_{k+1} - x_*\| \le \frac{[f'(\sigma(x_0)) + 1 + \kappa]\,[f'(\sigma(x_0))\sigma(x_0) - f(\sigma(x_0))]}{[\sigma(x_0) f'(\sigma(x_0))]^2}\,\|x_k - x_*\|^2, \qquad k = 0, 1, \ldots.$$



In order to prove Theorem 5 we need some results. From now on, we assume that all the
assumptions of Theorem 5 hold.

3.1 The majorant function 

Our first goal is to show that the constant $\delta$ associated with $\Omega$ and the constants $\nu$ and $\rho$ associated
with the majorant function $f$ are positive. Also, we will prove some results related to the function $f$.
We begin by noting that $\delta > 0$, because $\Omega$ is an open set and $x_* \in \Omega$.

Proposition 7. The constant $\nu$ is positive and $f'(t) < 0$ for all $t \in (0,\nu)$.

Proof. As $f'$ is continuous in $(0,R)$ and $f'(0) = -1$, there exists $\epsilon > 0$ such that $f'(t) < 0$ for
all $t \in (0,\epsilon)$. Hence, $\nu > 0$. Now, using h2 and the definition of $\nu$, the last part of the proposition
follows. □

Proposition 8. The following functions are positive and increasing:

i) $[0,\nu) \ni t \mapsto -1/f'(t)$;

ii) $[0,\nu) \ni t \mapsto -[f'(t) + 1 + \kappa]/f'(t)$;

iii) $(0,\nu) \ni t \mapsto [t f'(t) - f(t)]/t^2$;

iv) $(0,\nu) \ni t \mapsto [f'(t) + 1]/t$.

As a consequence,

$$(0,\nu) \ni t \mapsto \frac{[f'(t)+1+\kappa]\,[t f'(t) - f(t)]}{[t f'(t)]^2}, \qquad (0,\nu) \ni t \mapsto \frac{[f'(t)+1]^2}{[t f'(t)]^2}, \qquad (0,\nu) \ni t \mapsto \frac{f'(t)+1}{t[f'(t)]^2}$$

are also positive and increasing functions.

Proof. Items i and ii are immediate, because h1, h2 and Proposition 7 imply that $f'$ is strictly
increasing and $-1 \le f'(t) < 0$ for all $t \in [0,\nu)$.

Now, note that after some simple algebraic manipulations we have

$$\frac{t f'(t) - f(t)}{t^2} = \int_0^1 \frac{f'(t) - f'(\tau t)}{t}\, d\tau.$$

Hence, as $f'$ is strictly increasing (h2), we obtain that the function of item iii is positive. Moreover,
combining the last equation and Proposition 1 with $f' = \varphi$ and $\epsilon = \nu$, we conclude that the function of
item iii is increasing. So, item iii is proved.

Assumptions h1 and h2 imply that the function of item iv is positive. Hence, to conclude
item iv use h2, $f'(0) = -1$ and Proposition 1 with $f' = \varphi$, $\epsilon = \nu$ and $\tau = 0$.

To prove that the functions in the last part are positive and increasing, combine items i, ii and
iii for the first function and items i and iv for the second and third functions. □



Proposition 9. The constant $\rho$ is positive and there holds

$$\frac{[f'(t)+1+\kappa]\,[t f'(t) - f(t) + c\beta(1+\sqrt{2})(f'(t)+1)] + c\beta[f'(t)+1]}{t[f'(t)]^2} < 1, \qquad \forall\, t \in (0,\rho).$$

Proof. First, using h1 and some algebraic manipulation gives

$$\frac{t f'(t) - f(t)}{t} = f'(t) - \frac{f(t) - f(0)}{t - 0}, \qquad \frac{f'(t) + 1}{t} = \frac{f'(t) - f'(0)}{t - 0}.$$

Since $f'(0) = -1$ and $f'$ is convex, the last equations and Proposition 1 lead to

$$\lim_{t \to 0} \frac{t f'(t) - f(t)}{t} = 0, \qquad \lim_{t \to 0} \frac{f'(t) + 1}{t} = D^+f'(0),$$

which, combined with $f'(0) = -1$ and simple calculus, yields

$$\lim_{t \to 0} \frac{[f'(t)+1+\kappa]\,[t f'(t) - f(t) + c\beta(1+\sqrt{2})(f'(t)+1)] + c\beta[f'(t)+1]}{t[f'(t)]^2}
 = \kappa c\beta(1+\sqrt{2})D^+f'(0) + c\beta D^+f'(0) = c\beta[(1+\sqrt{2})\kappa + 1]D^+f'(0).$$

Now, using h3, i.e., $[(1+\sqrt{2})\kappa + 1]\,c\beta\,D^+f'(0) < 1$, we conclude that there exists an $\epsilon > 0$ such that

$$\frac{[f'(t)+1+\kappa]\,[t f'(t) - f(t) + c\beta(1+\sqrt{2})(f'(t)+1)] + c\beta[f'(t)+1]}{t[f'(t)]^2} < 1, \qquad t \in (0,\epsilon).$$

So, $\epsilon \le \rho$, which proves the first statement.

To conclude the proof, we use the definition of $\rho$ and Proposition 8. □

3.2 Relationship of the majorant function with the nonlinear function F

In this part we will present the main relationships between the majorant function $f$ and the function
$F$ associated with the problem (1). As usual, $\sigma(x) = \|x - x_*\|$.

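Lemma 10. Let $x \in \Omega$. If $\sigma(x) < \min\{\nu, \delta\}$, then $H(x) = F'(x)^*F'(x)$ is invertible and the
following inequalities hold

$$\|F'(x)^{\dagger}\| \le \frac{-\beta}{f'(\sigma(x))}, \qquad \|F'(x)^{\dagger} - F'(x_*)^{\dagger}\| \le \frac{-\sqrt{2}\,\beta\,[f'(\sigma(x)) + 1]}{f'(\sigma(x))}.$$

In particular, $H(x) = F'(x)^*F'(x)$ is invertible in $B(x_*,r)$.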



Proof. Let $x \in \Omega$ be such that $\sigma(x) < \min\{\nu, \delta\}$. Since $\sigma(x) < \nu$, by Proposition 7, $f'(\sigma(x)) < 0$.
Using the definition of $\beta$, the inequality (7) and h1 we have

$$\|F'(x_*)^{\dagger}\|\,\|F'(x) - F'(x_*)\| = \beta\|F'(x) - F'(x_*)\| \le f'(\sigma(x)) - f'(0) = f'(\sigma(x)) + 1 < 1.$$

Taking into account that $F'(x_*)$ is injective, in view of Lemma 2, $F'(x)$ is injective. So, $H(x)$ is
invertible. Moreover, again by Lemma 2,

$$\|F'(x)^{\dagger}\| \le \frac{\|F'(x_*)^{\dagger}\|}{1 - \|F'(x_*)^{\dagger}\|\,\|F'(x_*) - F'(x)\|}, \qquad
\|F'(x)^{\dagger} - F'(x_*)^{\dagger}\| \le \frac{\sqrt{2}\,\|F'(x_*)^{\dagger}\|^2\,\|F'(x_*) - F'(x)\|}{1 - \|F'(x_*)^{\dagger}\|\,\|F'(x_*) - F'(x)\|}.$$

Now, using the definition of $\beta$, inequality (7), h1 and that $\sigma(x) < \nu$, we have

$$\frac{1}{1 - \|F'(x_*)^{\dagger}\|\,\|F'(x) - F'(x_*)\|} \le \frac{1}{1 - (f'(\sigma(x)) - f'(0))} = \frac{-1}{f'(\sigma(x))}.$$

Thus, combining the last inequalities, we obtain the desired bounds for $\|F'(x)^{\dagger}\|$ and
$\|F'(x)^{\dagger} - F'(x_*)^{\dagger}\|$. The last part follows by noting that $r \le \min\{\nu, \delta\}$. □

To prove the convergence of the sequence $\{x_k\}$ in Theorem 5, the following relations will be needed.

Lemma 11. Let $x \in \Omega$. If $\sigma(x) < \min\{\nu, \delta\}$, then

i) $\|H(x)\|^{1/2} \le [f'(\sigma(x)) + 1 + \kappa]/\beta$;

ii) $\|H(x)^{-1}\|^{1/2} \le -\beta/f'(\sigma(x))$;

iii) $\beta\,\|(H(x) - H(x_*))F'(x_*)^{\dagger}\| \le (f'(\sigma(x)) + 2 + \kappa)(f'(\sigma(x)) + 1)$.

Proof. First, simple calculus, the inequality in (7) and the definitions of $\beta$ and $\kappa$ give

$$\beta\|F'(x)\| = \|F'(x_*)^{\dagger}\|\,\|F'(x)\| \le \beta\|F'(x) - F'(x_*)\| + \beta\|F'(x_*)\| \le f'(\sigma(x)) + 1 + \kappa. \tag{10}$$

As $\|H(x)\|^{1/2} = \|F'(x)^*F'(x)\|^{1/2} = \|F'(x)\|$, the first statement follows.

Now, to show item ii, use the definition of $H$, the last equality in (3) and Lemma 10.
For iii, note that the definition of $H$, some algebraic manipulations and (2) give

$$\beta\,\|(H(x) - H(x_*))F'(x_*)^{\dagger}\| = \beta\,\bigl\|F'(x)^*(F'(x) - F'(x_*))F'(x_*)^{\dagger} + (F'(x) - F'(x_*))^*\,\Pi_{\operatorname{im}(F'(x_*))}\bigr\|
 \le \bigl(\|F'(x)\|\,\|F'(x_*)^{\dagger}\| + 1\bigr)\,\beta\,\|F'(x) - F'(x_*)\|,$$

which, combined with (10) and the inequality in (7), implies the desired statement. □

Remark 4. Note that in Lemmas 10 and 11 we have used the fact that condition (7) holds
only for $\tau = 0$.




It is convenient to study the linearization error of $F$ at a point in $\Omega$, for which we define

$$E_F(x,y) := F(y) - [F(x) + F'(x)(y - x)], \qquad y, x \in \Omega. \tag{11}$$

We will bound this error by the error in the linearization of the majorant function $f$,

$$e_f(t,u) := f(u) - [f(t) + f'(t)(u - t)], \qquad t, u \in [0,R). \tag{12}$$

Lemma 12. Let $x \in \Omega$. If $\sigma(x) < \delta$, then $\beta\|E_F(x,x_*)\| \le e_f(\sigma(x),0)$.

Proof. Since $B(x_*,\delta)$ is convex, we obtain that $x_* + \tau(x - x_*) \in B(x_*,\delta)$ for $0 \le \tau \le 1$. Thus, as
$F$ is continuously differentiable in $\Omega$, the definition of $E_F$ and some simple manipulations yield

$$\beta\|E_F(x,x_*)\| \le \int_0^1 \beta\,\|F'(x) - F'(x_* + \tau(x - x_*))\|\,\|x_* - x\|\, d\tau.$$

From the last inequality and the assumption (7), we obtain

$$\beta\|E_F(x,x_*)\| \le \int_0^1 \bigl[f'(\sigma(x)) - f'(\tau\sigma(x))\bigr]\,\sigma(x)\, d\tau.$$

Evaluating the above integral and using the definition of $e_f$, the statement follows. □
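For completeness, the evaluation of the integral in the last step of the proof can be written out as follows, using only $f(0) = 0$ from h1 and the definition (12) of $e_f$ (the case $\sigma(x) = 0$ is trivial, so $\sigma(x) > 0$ is assumed):

```latex
\begin{align*}
\int_0^1 \bigl[f'(\sigma(x)) - f'(\tau\sigma(x))\bigr]\,\sigma(x)\,d\tau
  &= \sigma(x)\,f'(\sigma(x)) - \int_0^1 f'(\tau\sigma(x))\,\sigma(x)\,d\tau \\
  &= \sigma(x)\,f'(\sigma(x)) - \bigl[f(\sigma(x)) - f(0)\bigr] \\
  &= f(0) - \bigl[f(\sigma(x)) + f'(\sigma(x))\,(0 - \sigma(x))\bigr]
   = e_f(\sigma(x), 0).
\end{align*}
```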

Remark 5. If the inequality in (7) holds only for $\tau = 0$, then the upper bound of $\beta\|E_F(x,x_*)\|$ in
the previous lemma becomes $e_f(\sigma(x),0) + 2(f(\sigma(x)) + \sigma(x))$.

In particular, Lemma 10 guarantees that $H(x) = F'(x)^*F'(x)$ is invertible in $B(x_*,r)$ and,
consequently, $F'(x)^{\dagger}$ and $\operatorname{prox}_J^{H(x)}$ are well defined in this region. Hence, the proximal
Gauss-Newton iteration map is also well defined. Let us call $\mathcal{G}_F$ the proximal Gauss-Newton iteration
map for $F$ in that region:

$$\mathcal{G}_F : B(x_*,r) \to X, \qquad x \mapsto \operatorname{prox}_J^{H(x)}(G_F(x)), \tag{13}$$

where

$$H(x) = F'(x)^*F'(x), \qquad G_F(x) = x - F'(x)^{\dagger}F(x). \tag{14}$$

Take $x \in B(x_*,r)$. Note that the point computed by the proximal Gauss-Newton iteration, $\mathcal{G}_F(x)$,
may not be an element of $B(x_*,r)$, or may not even belong to the domain of $F$. Hence, well-definedness of
a single iteration is not enough on its own: to ensure that the iterations may be repeated indefinitely,
the new iterate must again lie in $B(x_*,r)$, as we will show in the following result.
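Lemma 13. Let $x \in \Omega$. If $\sigma(x) < r$, then $\mathcal{G}_F$ is well defined and there holds

$$\begin{aligned}
\|\mathcal{G}_F(x) - x_*\| \le{}& \frac{[f'(\sigma(x)) + 1 + \kappa]\,[f'(\sigma(x))\sigma(x) - f(\sigma(x))]}{[\sigma(x) f'(\sigma(x))]^2}\,\|x - x_*\|^2 \\
&+ \frac{(1+\sqrt{2})c\beta\,[f'(\sigma(x)) + 1]^2}{[\sigma(x) f'(\sigma(x))]^2}\,\|x - x_*\|^2
 + \frac{c\beta[(1+\sqrt{2})\kappa + 1]\,[f'(\sigma(x)) + 1]}{\sigma(x)[f'(\sigma(x))]^2}\,\|x - x_*\|.
\end{aligned} \tag{15}$$

In particular,

$$\|\mathcal{G}_F(x) - x_*\| < \|x - x_*\|. \tag{16}$$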

Proof. First, as $\|x - x_*\| < r$, it follows from Lemma 10 that $H(x) = F'(x)^*F'(x)$ is invertible;
then $G_F(x)$ and $\mathcal{G}_F(x)$ are well defined. Now, as $-F'(x_*)^*F(x_*) \in \partial J(x_*)$ and $F'(x_*)$ is injective,
it follows from Proposition 4, (13) and (14) that $x_* = \operatorname{prox}_J^{H(x_*)}(G_F(x_*))$. Hence,

$$\|\mathcal{G}_F(x) - x_*\| = \bigl\|\operatorname{prox}_J^{H(x)}(G_F(x)) - \operatorname{prox}_J^{H(x_*)}(G_F(x_*))\bigr\|,$$

which, combined with Lemma 3, yields

$$\|\mathcal{G}_F(x) - x_*\| \le \bigl(\|H(x)\|\|H(x)^{-1}\|\bigr)^{1/2}\|G_F(x) - G_F(x_*)\|
 + \|H(x)^{-1}\|\,\bigl\|(H(x) - H(x_*))\bigl(G_F(x_*) - \operatorname{prox}_J^{H(x_*)}(G_F(x_*))\bigr)\bigr\|.$$

Using $x_* = \operatorname{prox}_J^{H(x_*)}(G_F(x_*))$ and (14), the last inequality becomes

$$\|\mathcal{G}_F(x) - x_*\| \le \bigl(\|H(x)\|\|H(x)^{-1}\|\bigr)^{1/2}\|G_F(x) - G_F(x_*)\|
 + \|H(x)^{-1}\|\,\|(H(x) - H(x_*))F'(x_*)^{\dagger}\|\,\|F(x_*)\|.$$

For simplicity, we introduce the following notation:

$$A(x,x_*) = \bigl(\|H(x)\|\|H(x)^{-1}\|\bigr)^{1/2}\|G_F(x) - G_F(x_*)\| \tag{17}$$

and

$$B(x,x_*) = \|H(x)^{-1}\|\,\|(H(x) - H(x_*))F'(x_*)^{\dagger}\|\,\|F(x_*)\|. \tag{18}$$

So, from the three latter relations we have

$$\|\mathcal{G}_F(x) - x_*\| \le A(x,x_*) + B(x,x_*). \tag{19}$$

Now, we will obtain upper bounds for $A(x,x_*)$ and $B(x,x_*)$. First, some algebraic manipulations
and the definitions in (11) and (14) yield

$$\begin{aligned}
\|G_F(x) - G_F(x_*)\| &= \bigl\|F'(x)^{\dagger}[F'(x)(x - x_*) - F(x) + F(x_*)] + (F'(x_*)^{\dagger} - F'(x)^{\dagger})F(x_*)\bigr\| \\
&\le \|F'(x)^{\dagger}\|\,\|E_F(x,x_*)\| + \|F'(x_*)^{\dagger} - F'(x)^{\dagger}\|\,\|F(x_*)\|.
\end{aligned}$$

Combining the last inequality, Lemmas 10 and 12 and the definition of $c$, we have

$$\|G_F(x) - G_F(x_*)\| \le -\frac{e_f(\sigma(x),0)}{f'(\sigma(x))} - \frac{\sqrt{2}\,c\beta\,[f'(\sigma(x)) + 1]}{f'(\sigma(x))}.$$

So, the definition in (17), the last inequality and items i and ii of Lemma 11 imply

$$A(x,x_*) \le \frac{f'(\sigma(x)) + 1 + \kappa}{[f'(\sigma(x))]^2}\,\bigl(e_f(\sigma(x),0) + \sqrt{2}\,c\beta\,[f'(\sigma(x)) + 1]\bigr). \tag{20}$$

On the other hand, from the definition in (18) and items ii and iii of Lemma 11 we have

$$B(x,x_*) \le \frac{c\beta}{[f'(\sigma(x))]^2}\,(f'(\sigma(x)) + 2 + \kappa)(f'(\sigma(x)) + 1). \tag{21}$$

Hence, (19), (20) and (21) imply

$$\|\mathcal{G}_F(x) - x_*\| \le \frac{[f'(\sigma(x)) + 1 + \kappa]\,e_f(\sigma(x),0)}{[f'(\sigma(x))]^2}
 + \frac{(1+\sqrt{2})c\beta\,[f'(\sigma(x)) + 1]^2}{[f'(\sigma(x))]^2}
 + \frac{c\beta[(1+\sqrt{2})\kappa + 1]\,[f'(\sigma(x)) + 1]}{[f'(\sigma(x))]^2},$$

which, combined with (12), h1 and simple manipulation, yields (15).

To end the proof, first note that the right-hand side of (15) is equal to

$$\left[\frac{[f'(\sigma(x)) + 1 + \kappa]\,[\sigma(x) f'(\sigma(x)) - f(\sigma(x)) + c\beta(1+\sqrt{2})(f'(\sigma(x)) + 1)] + c\beta[f'(\sigma(x)) + 1]}{\sigma(x)[f'(\sigma(x))]^2}\right]\sigma(x).$$

On the other hand, as $x \in B(x_*,r)\setminus\{x_*\}$, i.e., $0 < \sigma(x) < r \le \rho$, we apply Proposition 9 with
$t = \sigma(x)$ to conclude that the quantity in the bracket above is less than one. So, (16) follows. □

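Remark 6. If the inequality in (7) holds only for $\tau = 0$, then (15) becomes

$$\begin{aligned}
\|\mathcal{G}_F(x) - x_*\| \le{}& \frac{[f'(\sigma(x)) + 1 + \kappa]\,[f'(\sigma(x))\sigma(x) + f(\sigma(x)) + 2\sigma(x)]}{[\sigma(x) f'(\sigma(x))]^2}\,\|x - x_*\|^2 \\
&+ \frac{(1+\sqrt{2})c\beta\,[f'(\sigma(x)) + 1]^2}{[\sigma(x) f'(\sigma(x))]^2}\,\|x - x_*\|^2
 + \frac{c\beta[(1+\sqrt{2})\kappa + 1]\,[f'(\sigma(x)) + 1]}{\sigma(x)[f'(\sigma(x))]^2}\,\|x - x_*\|.
\end{aligned}$$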
3.3 Proof of Theorem 5

First of all, note that the equation in (8) together with (13) and (14) imply that the sequence $\{x_k\}$
satisfies

$$x_{k+1} = \mathcal{G}_F(x_k), \qquad k = 0, 1, \ldots. \tag{22}$$

Proof. Since $x_0 \in B(x_*,r)\setminus\{x_*\}$, i.e., $0 < \sigma(x_0) < r$, by a combination of Lemma 10, the last
inequality in Lemma 13 and an induction argument it is easy to see that $\{x_k\}$ is well defined and
remains in $B(x_*,r)$.

Now, our goal is to show that $\{x_k\}$ converges to $x_*$. As $\{x_k\}$ is well defined and contained in
$B(x_*,r)$, (22) and Lemma 13 lead to

$$\begin{aligned}
\|x_{k+1} - x_*\| \le{}& \frac{[f'(\sigma(x_k)) + 1 + \kappa]\,[f'(\sigma(x_k))\sigma(x_k) - f(\sigma(x_k))]}{[\sigma(x_k) f'(\sigma(x_k))]^2}\,\|x_k - x_*\|^2 \\
&+ \frac{(1+\sqrt{2})c\beta\,[f'(\sigma(x_k)) + 1]^2}{[\sigma(x_k) f'(\sigma(x_k))]^2}\,\|x_k - x_*\|^2
 + \frac{c\beta[(1+\sqrt{2})\kappa + 1]\,[f'(\sigma(x_k)) + 1]}{\sigma(x_k)[f'(\sigma(x_k))]^2}\,\|x_k - x_*\|,
\end{aligned}$$

for all $k = 0, 1, \ldots$. Using again (22) and the second part of Lemma 13 it is easy to conclude that

$$\sigma(x_k) = \|x_k - x_*\| < \|x_0 - x_*\| = \sigma(x_0), \qquad k = 1, 2, \ldots. \tag{23}$$

Hence, by combining the last two inequalities with the last part of Proposition 8 we obtain that

$$\begin{aligned}
\|x_{k+1} - x_*\| \le{}& \frac{[f'(\sigma(x_0)) + 1 + \kappa]\,[f'(\sigma(x_0))\sigma(x_0) - f(\sigma(x_0))]}{[\sigma(x_0) f'(\sigma(x_0))]^2}\,\|x_k - x_*\|^2 \\
&+ \frac{(1+\sqrt{2})c\beta\,[f'(\sigma(x_0)) + 1]^2}{[\sigma(x_0) f'(\sigma(x_0))]^2}\,\|x_k - x_*\|^2
 + \frac{c\beta[(1+\sqrt{2})\kappa + 1]\,[f'(\sigma(x_0)) + 1]}{\sigma(x_0)[f'(\sigma(x_0))]^2}\,\|x_k - x_*\|,
\end{aligned}$$

for all $k = 0, 1, \ldots$, which is the inequality (9). Now, combining the last inequality with (23) we obtain

$$\|x_{k+1} - x_*\| \le \left[\frac{[f'(\sigma(x_0)) + 1 + \kappa]\,[f'(\sigma(x_0))\sigma(x_0) - f(\sigma(x_0)) + (1+\sqrt{2})c\beta(f'(\sigma(x_0)) + 1)]}{\sigma(x_0)[f'(\sigma(x_0))]^2}
 + \frac{c\beta\,[f'(\sigma(x_0)) + 1]}{\sigma(x_0)[f'(\sigma(x_0))]^2}\right]\|x_k - x_*\|,$$

for all $k = 0, 1, \ldots$. Applying Proposition 9 with $t = \sigma(x_0)$ it is straightforward to conclude from
the latter inequality that $\{\|x_k - x_*\|\}$ converges to zero. So, $\{x_k\}$ converges to $x_*$. □

4 Special cases 

In this section, we present two special cases of Theorem 5. The convergence theorem for the proximal
Gauss-Newton method under a Lipschitz condition and Smale's theorem on proximal Gauss-Newton
for analytic functions are included.

4.1 Convergence result for Lipschitz condition

In this section we show a theorem corresponding to Theorem 5 under a Lipschitz condition, instead
of the general assumption (7).

Theorem 14. Let $\Omega \subseteq X$ be an open set, $J : \Omega \to \mathbb{R} \cup \{+\infty\}$ a proper, convex and lower
semicontinuous functional and $F : \Omega \to Y$ a continuously differentiable function such that $F'$ has
a closed image in $\Omega$. Let $x_* \in \Omega$, $R > 0$ and

$$c := \|F(x_*)\|, \quad \beta := \|F'(x_*)^{\dagger}\|, \quad \kappa := \beta\|F'(x_*)\|, \quad \delta := \sup\{t \in [0,R) : B(x_*,t) \subset \Omega\}.$$

Suppose that $-F'(x_*)^*F(x_*) \in \partial J(x_*)$, $F'(x_*)$ is injective and there exists an $L > 0$ such that

$$h := [(1+\sqrt{2})\kappa + 1]\,c\beta L < 1, \qquad \beta\,\|F'(x) - F'(x_* + \tau(x - x_*))\| \le L(1 - \tau)\sigma(x), \tag{24}$$

where $x \in B(x_*,\delta)$, $\tau \in [0,1]$ and $\sigma(x) = \|x - x_*\|$. Let

$$r := \min\left\{ \frac{4 + \kappa + 2c(1+\sqrt{2})\beta L - \sqrt{(4 + \kappa + 2c(1+\sqrt{2})\beta L)^2 - 8(1 - h)}}{2L},\; \delta \right\}.$$

Then, the proximal Gauss-Newton method for solving (1), with starting point $x_0 \in B(x_*,r)\setminus\{x_*\}$,

$$x_{k+1} = \operatorname{prox}_J^{H(x_k)}\bigl(x_k - F'(x_k)^{\dagger}F(x_k)\bigr), \qquad k = 0, 1, \ldots,$$

is well defined, the generated sequence $\{x_k\}$ is contained in $B(x_*,r)$, converges to $x_*$ and

$$\|x_{k+1} - x_*\| \le \frac{\kappa L + 2c(1+\sqrt{2})\beta L^2 + L^2\sigma(x_0)}{2[1 - L\sigma(x_0)]^2}\,\|x_k - x_*\|^2 + \frac{[(1+\sqrt{2})\kappa + 1]\,c\beta L}{[1 - L\sigma(x_0)]^2}\,\|x_k - x_*\|, \tag{25}$$

for all $k = 0, 1, \ldots$.

Proof. It is immediate to prove that $F$, $x_*$ and $f : [0,\delta) \to \mathbb{R}$ defined by $f(t) = Lt^2/2 - t$ satisfy
the inequality (7) and conditions h1 and h2. Since $[(1+\sqrt{2})\kappa + 1]\,c\beta L < 1$, the condition h3 also
holds. In this case, it is easy to see that the constants $\nu$ and $\rho$ as defined in Theorem 5 satisfy

$$0 < \rho = \frac{4 + \kappa + 2c(1+\sqrt{2})\beta L - \sqrt{(4 + \kappa + 2c(1+\sqrt{2})\beta L)^2 - 8(1 - h)}}{2L} < \nu = 1/L;$$

as a consequence, $0 < r = \min\{\delta, \rho\}$. Therefore, as $F$, $J$, $r$, $f$ and $x_*$ satisfy all of the hypotheses of
Theorem 5, taking $x_0 \in B(x_*,r)\setminus\{x_*\}$ the statements of the theorem follow from Theorem 5. □
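The closed form of $\rho$ in the Lipschitz case can be cross-checked numerically against the general definition in Theorem 5: at $t = \rho$ the quotient defining $\rho$ should equal one, and it should be smaller than one to the left of $\rho$. The sketch below does this for $f(t) = Lt^2/2 - t$; the parameter values are illustrative assumptions (chosen so that $h < 1$), and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def q_lipschitz(t, L, kappa, c, beta):
    """Quotient in the definition of rho (Theorem 5) for f(t) = L*t^2/2 - t."""
    f = 0.5 * L * t * t - t
    df = L * t - 1.0
    num = ((df + 1.0 + kappa)
           * (t * df - f + c * beta * (1.0 + SQRT2) * (df + 1.0))
           + c * beta * (df + 1.0))
    return num / (t * df * df)

def rho_lipschitz(L, kappa, c, beta):
    """Closed-form rho from Theorem 14 (smaller root of a quadratic in t)."""
    h = ((1.0 + SQRT2) * kappa + 1.0) * c * beta * L
    a = 4.0 + kappa + 2.0 * c * (1.0 + SQRT2) * beta * L
    return (a - math.sqrt(a * a - 8.0 * (1.0 - h))) / (2.0 * L)

# Illustrative parameter values (assumptions, not taken from the paper).
L, kappa, c, beta = 2.0, 1.5, 0.05, 1.0
rho = rho_lipschitz(L, kappa, c, beta)
```

With these values $\rho$ lies strictly inside $(0, 1/L)$, as the proof asserts.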
Remark 7. If the second inequality in (24) holds only for $\tau = 0$, then an analogous theorem holds
true. More specifically, if we replace the definition of $r$ in the above theorem by

$$r := \min\left\{ \frac{-(4 + 3\kappa + 2c(1+\sqrt{2})\beta L) + \sqrt{(4 + 3\kappa + 2c(1+\sqrt{2})\beta L)^2 + 8(1 - h)}}{2L},\; \delta \right\},$$

then all the statements of the previous theorem are valid with the exception of inequality (25), which in
this case becomes

$$\|x_{k+1} - x_*\| \le \frac{3\kappa L + 2c(1+\sqrt{2})\beta L^2 + 3L^2\sigma(x_0)}{2[1 - L\sigma(x_0)]^2}\,\|x_k - x_*\|^2 + \frac{[(1+\sqrt{2})\kappa + 1]\,c\beta L}{[1 - L\sigma(x_0)]^2}\,\|x_k - x_*\|,$$

for all $k = 0, 1, \ldots$.

For zero-residual problems, i.e., $c = 0$, Theorem 14 becomes:

Corollary 15. Let $\Omega \subseteq X$ be an open set, $J : \Omega \to \mathbb{R} \cup \{+\infty\}$ a proper, convex and lower
semicontinuous functional and $F : \Omega \to Y$ a continuously differentiable function such that $F'$ has
a closed image in $\Omega$. Let $x_* \in \Omega$, $R > 0$ and

$$\beta := \|F'(x_*)^{\dagger}\|, \quad \kappa := \beta\|F'(x_*)\|, \quad \delta := \sup\{t \in [0,R) : B(x_*,t) \subset \Omega\}.$$

Suppose that $F(x_*) = 0$, $0 \in \partial J(x_*)$, $F'(x_*)$ is injective and there exists an $L > 0$ such that

$$\beta\,\|F'(x) - F'(x_* + \tau(x - x_*))\| \le L(1 - \tau)\sigma(x),$$

where $x \in B(x_*,\delta)$, $\tau \in [0,1]$ and $\sigma(x) = \|x - x_*\|$. Let

$$r := \min\left\{ \frac{4 + \kappa - \sqrt{(4 + \kappa)^2 - 8}}{2L},\; \delta \right\}.$$

Then, the proximal Gauss-Newton method for solving (1), with starting point $x_0 \in B(x_*,r)\setminus\{x_*\}$,

$$x_{k+1} = \operatorname{prox}_J^{H(x_k)}\bigl(x_k - F'(x_k)^{\dagger}F(x_k)\bigr), \qquad k = 0, 1, \ldots,$$

is well defined, the generated sequence $\{x_k\}$ is contained in $B(x_*,r)$, converges to $x_*$ and

$$\|x_{k+1} - x_*\| \le \frac{\kappa L + L^2\sigma(x_0)}{2[1 - L\sigma(x_0)]^2}\,\|x_k - x_*\|^2,$$

for all $k = 0, 1, \ldots$.
4.2 Convergence result under Smale's condition

In this section we present a theorem corresponding to Theorem 5 under Smale's condition. For
more details, see [2, 3, 17]. First we will prove two auxiliary lemmas.

Lemma 16. Let $\Omega \subseteq X$ be an open set, $F : \Omega \to Y$ an analytic function and

$$\gamma := \sup_{n > 1}\, \beta \left\|\frac{F^{(n)}(x_*)}{n!}\right\|^{1/(n-1)} < +\infty, \tag{26}$$

where $\beta := \|F'(x_*)^{\dagger}\|$. Suppose that $x_* \in \Omega$ and $B(x_*,1/\gamma) \subset \Omega$. Then, for all $x \in B(x_*,1/\gamma)$
there holds

$$\beta\,\|F''(x)\| \le \frac{2\gamma}{(1 - \gamma\|x - x_*\|)^3}.$$

Proof. The proof follows the same pattern as the proof of Lemma 21 of [6]. □



The next result provides a condition which is easier to check than (7) for twice continuously
differentiable functions.

Lemma 17. Let $\Omega \subseteq X$ be an open set, $x_* \in \Omega$ and $F : \Omega \to Y$ be twice continuously differentiable
on $\Omega$. If there exists a twice continuously differentiable function $f : [0,R) \to \mathbb{R}$ such that

$$\beta\,\|F''(x)\| \le f''(\|x - x_*\|), \tag{27}$$

for all $x \in B(x_*,R) \cap \Omega$, then $F$ and $f$ satisfy (7).

Proof. The proof follows the same pattern as the proof of Lemma 22 of [6]. □



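Theorem 18. Let $\Omega \subseteq X$ be an open set, $J : \Omega \to \mathbb{R} \cup \{+\infty\}$ a proper, convex and lower
semicontinuous functional and $F : \Omega \to Y$ an analytic function such that $F'$ has a closed image
in $\Omega$. Let $x_* \in \Omega$, $R > 0$ and

$$c := \|F(x_*)\|, \quad \beta := \|F'(x_*)^{\dagger}\|, \quad \kappa := \beta\|F'(x_*)\|, \quad \delta := \sup\{t \in [0,R) : B(x_*,t) \subset \Omega\}.$$

Suppose that $-F'(x_*)^*F(x_*) \in \partial J(x_*)$, $F'(x_*)$ is injective and

$$h := 2c\gamma\beta[(1+\sqrt{2})\kappa + 1] < 1,$$

where we recall that $\gamma := \sup_{n > 1} \beta \bigl\|F^{(n)}(x_*)/n!\bigr\|^{1/(n-1)} < +\infty$. Let the constants $a = \gamma c\beta$, $b = (1+\sqrt{2})\gamma c\beta$,

$$\bar{\rho} := \inf\bigl\{ s \in (\sqrt{2}/2, 1) : p(s) := -4s^4 + (1 - \kappa + a + b(\kappa - 1))s^3 + (3 + \kappa + a + b(\kappa - 1))s^2 + (b - 1)s + b < 0 \bigr\}, \tag{28}$$

$$r := \min\{(1 - \bar{\rho})/\gamma,\; \delta\}.$$

Then, the proximal Gauss-Newton method for solving (1), with starting point $x_0 \in B(x_*,r)\setminus\{x_*\}$,

$$x_{k+1} = \operatorname{prox}_J^{H(x_k)}\bigl(x_k - F'(x_k)^{\dagger}F(x_k)\bigr), \qquad k = 0, 1, \ldots,$$

is well defined, the generated sequence $\{x_k\}$ is contained in $B(x_*,r)$, converges to $x_*$ and

$$\begin{aligned}
\|x_{k+1} - x_*\| \le{}& \gamma\,\frac{1 + (\kappa - 1)(1 - \gamma\sigma(x_0))^2}{[1 - 2(1 - \gamma\sigma(x_0))^2]^2}\,\|x_k - x_*\|^2
 + \frac{(1+\sqrt{2})\beta c\gamma^2(2 - \gamma\sigma(x_0))^2}{[1 - 2(1 - \gamma\sigma(x_0))^2]^2}\,\|x_k - x_*\|^2 \\
&+ \frac{c\beta\gamma[(1+\sqrt{2})\kappa + 1](2 - \gamma\sigma(x_0))(1 - \gamma\sigma(x_0))^2}{[1 - 2(1 - \gamma\sigma(x_0))^2]^2}\,\|x_k - x_*\|,
\end{aligned} \tag{29}$$

for all $k = 0, 1, \ldots$.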

Proof. Consider the real function $f : [0, 1/\gamma) \to \mathbb{R}$ defined by

$$f(t) = \frac{t}{1 - \gamma t} - 2t.$$

It is straightforward to show that $f$ is analytic and that

$$f(0) = 0, \quad f'(t) = \frac{1}{(1 - \gamma t)^2} - 2, \quad f'(0) = -1, \quad f''(t) = \frac{2\gamma}{(1 - \gamma t)^3}, \quad f^{(n)}(0) = n!\,\gamma^{n-1},$$

for $n \ge 2$. It follows from the last equalities that $f$ satisfies h1 and h2. Since
$h = 2\gamma c\beta[(1+\sqrt{2})\kappa + 1] < 1$, the condition h3 also holds. Now, as $f''(t) = 2\gamma/(1 - \gamma t)^3$, combining Lemmas 16
and 17, we conclude that $F$ and $f$ satisfy (7) with $R = 1/\gamma$. In this case,

$$\nu = (2 - \sqrt{2})/(2\gamma) < 1/\gamma.$$

Now, we will obtain the constant $\rho$ as defined in Theorem 5. For simplicity, consider the following
change of variable:

$$s = 1 - \gamma t.$$

Then, $t = (1 - s)/\gamma$. Moreover, if $t$ satisfies $0 < t < \nu = (2 - \sqrt{2})/(2\gamma)$, then $\sqrt{2}/2 < s < 1$. Hence,
determining the constant $\rho$ as defined in Theorem 5 is equivalent to determining the constant

$$\bar{\rho} = \inf\bigl\{ s \in (\sqrt{2}/2, 1) : p(s) = -4s^4 + (1 - \kappa + a + b(\kappa - 1))s^3 + (3 + \kappa + a + b(\kappa - 1))s^2 + (b - 1)s + b < 0 \bigr\},$$

where $a = \gamma c\beta$ and $b = (1+\sqrt{2})\gamma c\beta$. Thus, taking into account the change of variable, we have
$\rho = (1 - \bar{\rho})/\gamma$ and

$$r = \min\{(1 - \bar{\rho})/\gamma,\; \delta\}.$$

Therefore, as $F$, $J$, $r$, $f$ and $x_*$ satisfy all hypotheses of Theorem 5, taking $x_0 \in B(x_*,r)\setminus\{x_*\}$, the
statements of the theorem follow from Theorem 5. □




Remark 8. Once the numerical values of $a$, $b$ and $\kappa$ are fixed, as $p(1) = h - 1 < 0$, it is easy to compute
$\bar{\rho}$, defined in (28). Moreover, as $f(t) = t/(1 - \gamma t) - 2t$ is the majorant function, by Proposition 8
it follows that $p$ is decreasing in $(\sqrt{2}/2, 1)$.
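One simple way to carry out the computation mentioned in Remark 8 is bisection on $(\sqrt{2}/2, 1)$, relying on the sign change of $p$ between the endpoints and on the monotonicity stated in the remark. The sketch below is an illustration under assumed parameter values (chosen so that $h < 1$); the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def p_smale(s, kappa, a, b):
    """Polynomial p(s) from (28)."""
    return (-4.0 * s**4
            + (1.0 - kappa + a + b * (kappa - 1.0)) * s**3
            + (3.0 + kappa + a + b * (kappa - 1.0)) * s**2
            + (b - 1.0) * s + b)

def rho_bar(kappa, a, b, iters=200):
    """Bisection for the root of p in (sqrt(2)/2, 1).

    Assumes p(sqrt(2)/2) > 0 > p(1), so the root is the infimum in (28)
    when p is decreasing on the interval.
    """
    lo, hi = SQRT2 / 2.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if p_smale(mid, kappa, a, b) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi

# Illustrative parameter values (assumptions, not taken from the paper).
gamma, kappa, c, beta = 1.0, 1.5, 0.05, 1.0
a = gamma * c * beta
b = (1.0 + SQRT2) * gamma * c * beta
h = 2.0 * c * gamma * beta * ((1.0 + SQRT2) * kappa + 1.0)
s_bar = rho_bar(kappa, a, b)
rho = (1.0 - s_bar) / gamma   # candidate radius before taking the minimum with delta
```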

For zero-residual problems, i.e., $c = 0$, Theorem 18 becomes:

Corollary 19. Let $\Omega \subseteq X$ be an open set, $J : \Omega \to \mathbb{R} \cup \{+\infty\}$ a proper, convex and lower
semicontinuous functional and $F : \Omega \to Y$ an analytic function such that $F'$ has a closed image
in $\Omega$. Let $x_* \in \Omega$, $R > 0$ and

$$\beta := \|F'(x_*)^{\dagger}\|, \quad \kappa := \beta\|F'(x_*)\|, \quad \delta := \sup\{t \in [0,R) : B(x_*,t) \subset \Omega\}.$$

Suppose that $F(x_*) = 0$, $0 \in \partial J(x_*)$, $F'(x_*)$ is injective and

$$\gamma := \sup_{n > 1}\, \beta \left\|\frac{F^{(n)}(x_*)}{n!}\right\|^{1/(n-1)} < +\infty.$$

Let the positive constants $\bar{\rho}$ and $r$ be given by

$$\bar{\rho} := \inf\bigl\{ s \in (\sqrt{2}/2, 1) : p(s) := -4s^3 + (1 - \kappa)s^2 + (3 + \kappa)s - 1 < 0 \bigr\}, \qquad r := \min\{(1 - \bar{\rho})/\gamma,\; \delta\}.$$

Then, the proximal Gauss-Newton method for solving (1), with starting point $x_0 \in B(x_*,r)\setminus\{x_*\}$,

$$x_{k+1} = \operatorname{prox}_J^{H(x_k)}\bigl(x_k - F'(x_k)^{\dagger}F(x_k)\bigr), \qquad k = 0, 1, \ldots,$$

is well defined, the generated sequence $\{x_k\}$ is contained in $B(x_*,r)$, converges to $x_*$ and

$$\|x_{k+1} - x_*\| \le \gamma\,\frac{1 + (\kappa - 1)(1 - \gamma\sigma(x_0))^2}{[1 - 2(1 - \gamma\sigma(x_0))^2]^2}\,\|x_k - x_*\|^2, \qquad k = 0, 1, \ldots.$$



5 Final remark 



Under a majorant condition, we presented a new local convergence analysis of the proximal
Gauss-Newton method for solving penalized nonlinear least squares problems. It would also be interesting to
present a semi-local convergence analysis of the proximal Gauss-Newton method, under a majorant
condition, for the problem under consideration. This analysis will be carried out in future work.